    Learning Agent for a Heat-Pump Thermostat With a Set-Back Strategy Using Model-Free Reinforcement Learning

    Full text link
    The conventional control paradigm for a heat pump with a less efficient auxiliary heating element is to keep its temperature set point constant during the day. This constant set point ensures that the heat pump operates in its more efficient heat-pump mode and minimizes the risk of activating the less efficient auxiliary heating element. As an alternative to a constant set-point strategy, this paper proposes a learning agent for a thermostat with a set-back strategy. This set-back strategy relaxes the set-point temperature during convenient moments, e.g., when the occupants are not at home. Finding an optimal set-back strategy requires solving a sequential decision-making process under uncertainty, which presents two challenges. The first challenge is that for most residential buildings, a description of the thermal characteristics of the building is unavailable and challenging to obtain. The second challenge is that the relevant information on the state, i.e., the building envelope, cannot be measured by the learning agent. To overcome these two challenges, this paper proposes an auto-encoder coupled with a batch reinforcement learning technique. The proposed approach is validated for two building types with different thermal characteristics, for heating in the winter and cooling in the summer. The simulation results indicate that the proposed learning agent can reduce the energy consumption by 4–9% during 100 winter days and by 9–11% during 80 summer days compared to the conventional constant set-point strategy. (Comment: submitted to Energies, MDPI.)
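
    As a rough, hedged illustration of the state-representation idea, the sketch below trains a small auto-encoder on daily indoor-temperature traces and appends its bottleneck activations to the observable state handed to a batch RL agent. The data, dimensions, and the encode helper are illustrative assumptions, not the paper's implementation.

```python
# Minimal auto-encoder sketch: the hidden layer is used as a compact proxy
# for the unmeasured building-envelope state. All numbers are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Assumed data: 500 past days, each a 96-step (15-min) indoor-temperature trace.
X = 20.0 + rng.normal(0.0, 1.5, size=(500, 96))

# Train an MLP to reproduce its input; the 3-unit bottleneck then acts as
# an auto-encoder code for the hidden thermal state.
ae = MLPRegressor(hidden_layer_sizes=(3,), activation="relu",
                  max_iter=2000, random_state=0)
ae.fit(X, X)

def encode(trace):
    """Hidden-layer activations = learned state features (ReLU encoder)."""
    return np.maximum(0.0, trace @ ae.coefs_[0] + ae.intercepts_[0])

z = encode(X[:1])                                  # shape (1, 3)
state = np.concatenate([[21.4, 5.0], z.ravel()])   # e.g. T_room, T_out + code
print(state)
```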

    Residential Demand Response Using Reinforcement Learning: From Theory to Practice

    No full text
    The increasing share of renewable energy sources introduces the need for flexibility on the demand side of the electricity system. Prominent examples of loads that offer flexibility at the residential level are thermostatically controlled loads, such as heat pumps, air-conditioning units, and electric water heaters. Demand response programs harness demand flexibility by enabling consumers to adapt their electricity consumption profile in response to changes in the electricity price or other grid signals. The traditional control paradigm defines the demand response problem as a model-based control problem, requiring a model of the demand response application, an optimizer, and a forecasting technique. A critical step in setting up a model-based controller is selecting accurate models and estimating the model parameters. This step becomes even more challenging given the heterogeneity of end users and their different patterns of behavior: different end users are expected to have different model parameters and even different models. Building such a controller is a cumbersome endeavour that requires custom expert knowledge and has to be repeated for each load, making large-scale deployment of similar solutions challenging. Reinforcement Learning (RL), on the other hand, is a model-free technique that requires no a priori knowledge and treats its environment as a "black box". RL techniques enable an agent to learn a control policy by interacting with its environment, without the need for modeling and system identification techniques. Inspired by recent developments in batch RL, this work builds upon the existing batch RL literature and contributes to its application to residential demand response, opening the door for practical implementations: from theory to practice. This dissertation proposes a model-free approach to harness the flexibility of thermostatically controlled loads that is practical, cost-effective, self-adaptive, and generally applicable. Contents: 1. Introduction; 2. Control strategies for residential demand response; 3. Reinforcement learning; 4. Demand response using batch reinforcement learning; 5. Simulation results; 6. Experimental results; 7. Aggregated demand response; 8. Conclusions and future research. (162 pages)
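
    Since batch RL is the workhorse of the dissertation, a minimal fitted Q-iteration sketch follows, assuming a synthetic batch of (state, action, cost, next state) tuples for a heat pump; the cost signal, dynamics, and regressor settings are invented for illustration only.

```python
# Fitted Q-iteration (batch RL) sketch over synthetic transitions.
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor

rng = np.random.default_rng(1)
n, gamma = 2000, 0.95

# Assumed batch (s, a, c, s'): s = (room temp degC, price EUR/MWh), a = heat on/off.
S = np.column_stack([rng.uniform(18, 24, n), rng.uniform(20, 80, n)])
A = rng.integers(0, 2, n)
S2 = S + np.column_stack([0.5 * A - 0.25, rng.normal(0, 5, n)])
C = S[:, 1] * A + 10.0 * (S2[:, 0] < 19.0)   # energy cost + comfort penalty

Q = None
for _ in range(20):  # FQI: repeatedly regress the Bellman target
    if Q is None:
        target = C
    else:
        q_next = np.column_stack(
            [Q.predict(np.column_stack([S2, np.full(n, a)])) for a in (0, 1)])
        target = C + gamma * q_next.min(axis=1)  # costs are minimized
    Q = ExtraTreesRegressor(n_estimators=30, random_state=0)
    Q.fit(np.column_stack([S, A]), target)

def policy(s):
    """Greedy (cost-minimizing) action under the fitted Q-function."""
    q = [Q.predict(np.append(s, a).reshape(1, -1))[0] for a in (0, 1)]
    return int(np.argmin(q))

print(policy(np.array([20.0, 45.0])))
```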

    Using reinforcement learning for optimizing heat pump control in a building model in Modelica

    No full text

    A Flexible Stochastic Optimization Method for Wind Power Balancing With PHEVs

    No full text
    This paper proposes a flexible optimization method, based on state-of-the-art algorithms, for the smart control of plug-in hybrid electric vehicles (PHEVs) to balance wind power production. The problem is approached from the perspective of a balance responsible party (BRP) with a large share of wind power in its portfolio. The BRP uses controllable PHEVs to minimize the imbalance of its portfolio resulting from wind power forecast errors. A Markov Decision Process (MDP) formulation in combination with dynamic programming is used to solve the multistage stochastic problem. The main difficulty in applying MDPs to this problem is efficiently including the time interdependence of the wind power forecast error. In the presented approach, the probability distribution and time interdependence of the forecast error are represented by a scenario tree. Because of the MDP formulation, the algorithm can be adapted to different transition models and constraints. This feature makes it possible to use the algorithm in a dynamic environment such as the future smart grid. To demonstrate this, a generic charging model for PHEVs is used in the BRP wind-balancing case. The flexibility of the algorithm is shown by investigating the solution for different degrees of complexity in the charging model.
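
    The sketch below shows the MDP/dynamic-programming layer on a toy scale: backward induction over a discretized fleet state of charge against a stage-wise forecast-error distribution. For brevity it treats the errors as independent per stage, whereas the paper's scenario tree additionally captures their time interdependence; all grids, probabilities, and costs are assumptions.

```python
# Backward dynamic programming for PHEV charging vs. wind imbalance (toy).
import numpy as np

T = 4                                  # decision stages (hours)
errors = [0.0, -1.0, 1.0]              # wind forecast-error branches (MW)
probs = [0.5, 0.25, 0.25]
soc_grid = np.arange(0.0, 4.1, 1.0)    # aggregate PHEV energy states (MWh)
actions = np.arange(0.0, 2.1, 1.0)     # charging power levels (MW, 1 h steps)

# V[t][i] = expected cost-to-go from SoC soc_grid[i] at stage t;
# terminal penalty for energy not delivered to the fleet.
V = np.zeros((T + 1, len(soc_grid)))
V[T] = 4.0 * (soc_grid[-1] - soc_grid)

for t in reversed(range(T)):
    for i, soc in enumerate(soc_grid):
        best = np.inf
        for a in actions:
            nxt = soc + a
            if nxt > soc_grid[-1]:
                continue                # fleet capacity limit
            j = int(np.searchsorted(soc_grid, nxt))
            # Charging absorbs surplus: imbalance = error - charging power.
            exp_cost = sum(p * (e - a) ** 2 for e, p in zip(errors, probs))
            best = min(best, exp_cost + V[t + 1, j])
        V[t, i] = best

print(V[0, 0])   # expected cost starting from an empty fleet
```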

    Sequential Decision-Making Strategy for a Demand Response Aggregator in a Two-Settlement Electricity Market

    No full text
    This paper proposes a novel sequential decision-making strategy for a demand response aggregator that participates in the day-ahead market and reacts to imbalance prices. Finding such a participation strategy requires solving a multistage optimization problem under uncertainty that entails both an open-loop (day-ahead market) and a nested closed-loop (imbalance system) problem. Driven by the possibility of using data-driven models and reinforcement learning techniques, we formulate the problem as a Markov Decision Process (MDP). Standard MDP-based methods, however, often suffer from the curse of dimensionality. To address this challenge, we use techniques from approximate dynamic programming. Our proposed method applies a cross-entropy method with a simulation-based approximate policy iteration algorithm nested inside. The cross-entropy method is compared with a separated planning method that optimizes the day-ahead and real-time decisions separately. Both planning methods are evaluated for an aggregator with a fleet of electric vehicles using data from the Belgian electricity market.
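
    As a hedged sketch of the outer loop, the cross-entropy method below searches over a parameterised 24-hour day-ahead bid; the simulated_cost function is a stand-in assumption for the nested real-time (imbalance) simulation, not the paper's model.

```python
# Cross-entropy method (CEM) over a day-ahead bid vector (toy).
import numpy as np

rng = np.random.default_rng(2)
dim, n_samples, n_elite, n_iters = 24, 64, 8, 30   # hourly bid quantities

def simulated_cost(bid):
    """Assumed placeholder for day-ahead cost + expected imbalance settlement."""
    target = 5.0 + 2.0 * np.sin(np.linspace(0.0, 2.0 * np.pi, dim))
    return float(np.sum((bid - target) ** 2))

mu, sigma = np.full(dim, 5.0), np.full(dim, 3.0)
for _ in range(n_iters):
    bids = rng.normal(mu, sigma, size=(n_samples, dim))       # sample bids
    costs = np.array([simulated_cost(b) for b in bids])
    elite = bids[np.argsort(costs)[:n_elite]]                 # keep cheapest
    mu, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-3  # refit sampler

print(round(simulated_cost(mu), 3))
```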